
GH-121970: Extract pydoc_topics into a new extension#129116

Merged
AA-Turner merged 8 commits into python:main from AA-Turner:docs/pydoc-topics
Jan 21, 2025

@AA-Turner AA-Turner commented Jan 21, 2025

This also simplifies the pydoc-topics builder. Grouping the topic labels by docname improves the speed of topics generation from ~68s to ~57s (19% faster) from a cold state, and from ~13s to ~3.4s (3.8x faster) when re-using the pickled documents.
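As a rough illustration of the grouping idea (not the builder's actual code), the sketch below groups topic labels by the document that defines them, so each document only needs to be loaded once. The `labels` mapping of label name to `(docname, label_id)` and the helper name `group_labels_by_docname` are assumptions made for this example.

```python
from collections import defaultdict


def group_labels_by_docname(labels):
    """Group (label, label_id) pairs by the document that defines them.

    ``labels`` is a hypothetical mapping of label name -> (docname, label_id);
    the real builder's data structures may differ.
    """
    grouped = defaultdict(list)
    for label, (docname, label_id) in labels.items():
        grouped[docname].append((label, label_id))
    return dict(grouped)


labels = {
    "assert": ("reference/simple_stmts", "assert"),
    "del": ("reference/simple_stmts", "del"),
    "if": ("reference/compound_stmts", "if"),
}
grouped = group_labels_by_docname(labels)
# Each docname now appears once, so its document is loaded a single time
# instead of once per label.
```

With this shape, the expensive step (loading a document) runs once per key of `grouped` rather than once per topic label.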

The representation of topics.py also changes from the default pprint.pformat output of:

topics = {'assert': 'The "assert" statement\n'
           '**********************\n'
           '\n'
           'Assert statements are a convenient way to insert debugging '
           'assertions\n'
           'into a program:\n'

to a simpler representation using triple single quotes (save for when ''' appears in the body):

topics = {
    'assert': r'''The "assert" statement
**********************

Assert statements are a convenient way to insert debugging assertions
into a program:
'''

This representation is nicer to read and is 63% of the size of the current topics.py (518KB vs 830KB). The line count also decreases from 17,486 to 12,782.
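A minimal sketch of how such a representation could be written out, assuming a fallback to `repr()` whenever a raw triple-quoted literal would not round-trip. The helper names `format_topic` and `format_topics` are hypothetical, and the extra fallback conditions (text ending in a quote or a backslash) are this sketch's own precaution, not something stated in the PR.

```python
def format_topic(text):
    """Return Python source for ``text`` (hypothetical helper)."""
    # A raw '''-quoted literal cannot contain ''' and cannot end with a
    # single quote or a backslash, so fall back to repr() in those cases.
    if "'''" in text or text.endswith(("'", "\\")):
        return repr(text)
    return f"r'''{text}'''"


def format_topics(topics):
    """Render a topics dict as the source of a ``topics = {...}`` module."""
    lines = ["topics = {"]
    for name, text in topics.items():
        lines.append(f"    {name!r}: {format_topic(text)},")
    lines.append("}")
    return "\n".join(lines) + "\n"


sample = {
    "assert": 'The "assert" statement\n**********************\n',
    "strings": "Triple quotes look like ''' in source.\n",
    "escapes": "A backslash \\n stays literal in a raw string.\n",
}
source = format_topics(sample)

# Round-trip check: executing the generated source recovers the same dict.
namespace = {}
exec(source, namespace)
```

Executing the generated source and comparing the result against the input is a cheap way to confirm that no topic was altered by the change of quoting style.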

Tested by running:

>>> import runpy
>>> topics_old = runpy.run_path("Doc/build/topics_old.py")['topics']
>>> topics_new = runpy.run_path("Doc/build/topics_new.py")['topics']
>>> assert list(topics_old) == list(topics_new) # check order
>>> [k for k in topics_old if topics_old[k] != topics_new[k]]
['debugger', 'formatstrings']

The 'formatstrings' change is trailing whitespace on the >>> for num in range(5,12): line:

>>> fs_old_stripped = '\n'.join(map(str.rstrip, topics_old['formatstrings'].splitlines()))
>>> fs_new_stripped = '\n'.join(map(str.rstrip, topics_new['formatstrings'].splitlines()))
>>> assert fs_old_stripped == fs_new_stripped
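The same check can be generalised to every differing topic. The sketch below is only an illustration: it assumes both dicts share the same keys (as the order assertion above verifies), and the helper names are hypothetical.

```python
def strip_trailing_whitespace(text):
    """Remove trailing whitespace from every line of ``text``."""
    return "\n".join(line.rstrip() for line in text.splitlines())


def classify_differences(old, new):
    """Split differing keys into whitespace-only and substantive changes.

    Assumes ``old`` and ``new`` have identical keys.
    """
    whitespace_only, substantive = [], []
    for key in old:
        if old[key] == new[key]:
            continue
        if strip_trailing_whitespace(old[key]) == strip_trailing_whitespace(new[key]):
            whitespace_only.append(key)
        else:
            substantive.append(key)
    return whitespace_only, substantive


old = {"a": "same\n", "b": "line  \nnext\n", "c": 'press "Ctrl-C"\n'}
new = {"a": "same\n", "b": "line\nnext\n", "c": 'press "Ctrl"-"C"\n'}
whitespace_only, substantive = classify_differences(old, new)
```

Only the keys reported as substantive then need a manual diff.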

The 'debugger' change is "Ctrl-C" to "Ctrl"-"C", but I'm not sure what caused this:

>>> import difflib
>>> print('\n'.join(difflib.unified_diff(topics_old['debugger'].splitlines(), topics_new['debugger'].splitlines())))
--- 
+++ 
@@ -186,9 +186,9 @@
    originate in a module that matches one of these patterns. [1]
 
    By default, Pdb sets a handler for the SIGINT signal (which is sent
-   when the user presses "Ctrl-C" on the console) when you give a
+   when the user presses "Ctrl"-"C" on the console) when you give a
    "continue" command. This allows you to break into the debugger
-   again by pressing "Ctrl-C".  If you want Pdb not to touch the
+   again by pressing "Ctrl"-"C".  If you want Pdb not to touch the
    SIGINT handler, set *nosigint* to true.

cc @hugovk as 3.14 release manager, as this changes the format of Lib/pydoc_data/topics.py. When doing backports, I'm happy to preserve the current format (pformat) if release managers would prefer.

A


📚 Documentation preview 📚: https://cpython-previews--129116.org.readthedocs.build/


Labels

docs Documentation in the Doc dir skip news

Projects

Status: Done
